Facial Action Unit Detection Using Attention and Relation Learning
The attention mechanism has recently attracted increasing attention in the field
of facial action unit (AU) detection. By finding the region of interest of each
AU with the attention mechanism, AU-related local features can be captured.
Most of the existing attention based AU detection works use prior knowledge to
predefine fixed attentions or refine the predefined attentions within a small
range, which limits their capacity to model various AUs. In this paper, we
propose an end-to-end deep learning based attention and relation learning
framework for AU detection with only AU labels, which has not been explored
before. In particular, multi-scale features shared by each AU are learned
first, and then both channel-wise and spatial attentions are adaptively
learned to select and extract AU-related local features. Moreover, pixel-level
relations for AUs are further captured to refine spatial attentions so as to
extract more relevant local features. Without changing the network
architecture, our framework can be easily extended for AU intensity estimation.
Extensive experiments show that our framework (i) soundly outperforms the
state-of-the-art methods for both AU detection and AU intensity estimation on
the challenging BP4D, DISFA, FERA 2015 and BP4D+ benchmarks, (ii) can
adaptively capture the correlated regions of each AU, and (iii) also works well
under severe occlusions and large poses.
Comment: This paper is accepted by IEEE Transactions on Affective Computing.
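The channel-wise and spatial attention step described in this abstract can be illustrated with a minimal sketch. It is not the authors' network: the sigmoid gating, the global average pooling, and the per-channel weights `w` are assumptions chosen only to show how the two kinds of attention rescale a feature map.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, w):
    # feat: C x H x W nested lists; w: hypothetical per-channel weights.
    # Global average pooling per channel, followed by a sigmoid gate.
    gates = []
    for c, channel in enumerate(feat):
        n = len(channel) * len(channel[0])
        pooled = sum(sum(row) for row in channel) / n
        gates.append(sigmoid(w[c] * pooled))
    return gates

def spatial_attention(feat, w):
    # One gate in (0, 1) per pixel, from a weighted sum across channels.
    height, width = len(feat[0]), len(feat[0][0])
    return [[sigmoid(sum(w[c] * feat[c][i][j] for c in range(len(feat))))
             for j in range(width)]
            for i in range(height)]

def apply_attention(feat, ch_gates, sp_gates):
    # Rescale every activation by its channel gate and its pixel gate.
    return [[[v * ch_gates[c] * sp_gates[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(channel)]
            for c, channel in enumerate(feat)]
```

In a trained model the weights would be learned from AU labels; here they are fixed inputs so the gating arithmetic stays visible.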
Fine-Grained Expression Manipulation via Structured Latent Space
Fine-grained facial expression manipulation is a challenging problem, as
fine-grained expression details are difficult to capture. Most existing
expression manipulation methods resort to discrete expression labels, which
mainly edit global expressions and ignore the manipulation of fine details. To
tackle this limitation, we propose an end-to-end expression-guided generative
adversarial network (EGGAN), which utilizes structured latent codes and
continuous expression labels as input to generate images with expected
expressions. Specifically, we adopt an adversarial autoencoder to map a source
image into a structured latent space. Then, given the source latent code and
the target expression label, we employ a conditional GAN to generate a new
image with the target expression. Moreover, we introduce a perceptual loss and
a multi-scale structural similarity loss to preserve identity and global shape
during generation. Extensive experiments show that our method can manipulate
fine-grained expressions, and generate continuous intermediate expressions
between source and target expressions.
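One way to picture the continuous intermediate expressions mentioned above is to interpolate between a source and a target label vector. The sketch below assumes linear interpolation and a hypothetical function name, `interpolate_labels`; the abstract does not say how intermediate labels are chosen, and the generator that would consume them is omitted.

```python
def interpolate_labels(source, target, steps):
    # Return `steps` label vectors moving linearly from source to target.
    # Linear interpolation is an assumption made for illustration.
    if steps < 2:
        raise ValueError("steps must be at least 2")
    labels = []
    for k in range(steps):
        t = k / (steps - 1)
        labels.append([(1 - t) * s + t * g for s, g in zip(source, target)])
    return labels
```

Each returned vector would serve as the conditioning label for one generated frame.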
Spatio-Temporal Relation and Attention Learning for Facial Action Unit Detection
Spatio-temporal relations among facial action units (AUs) convey significant
information for AU detection yet have not been thoroughly exploited. The main
reasons are the limited capability of current AU detection works in
simultaneously learning spatial and temporal relations, and the lack of precise
localization information for AU feature learning. To tackle these limitations,
we propose a novel spatio-temporal relation and attention learning framework
for AU detection. Specifically, we introduce a spatio-temporal graph
convolutional network to capture both spatial and temporal relations from
dynamic AUs, in which the AU relations are formulated as a spatio-temporal
graph with adaptively learned instead of predefined edge weights. Moreover, the
learning of spatio-temporal relations among AUs requires individual AU
features. Considering the dynamism and shape irregularity of AUs, we propose an
attention regularization method to adaptively learn regional attentions that
capture highly relevant regions and suppress irrelevant regions so as to
extract a complete feature for each AU. Extensive experiments show that our
approach achieves substantial improvements over the state-of-the-art AU
detection methods on the BP4D and especially DISFA benchmarks.
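A graph convolution with learned rather than predefined edge weights can be sketched as follows. This is a generic single layer, not the authors' architecture: the row-wise softmax normalization, the ReLU, and the names `edge_logits` and `weight` are assumptions. Temporal relations could be covered by treating each (AU, frame) pair as a node.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of edge logits.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def graph_conv(node_feats, edge_logits, weight):
    # edge_logits (N x N) stands in for learnable parameters; softmax turns
    # each row into edge weights that sum to one.
    adjacency = [softmax(row) for row in edge_logits]
    aggregated = matmul(adjacency, node_feats)   # mix features of related nodes
    projected = matmul(aggregated, weight)       # linear projection (F x F_out)
    return [[max(0.0, v) for v in row] for row in projected]  # ReLU
```

With all edge logits equal, every node receives the plain average of all node features, which is the uninformed starting point before any learning.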
Size dependent effectiveness of engineering and administrative control strategies for both short- and long-range airborne transmission control
Ventilation is recognized as an effective mitigation strategy for long-range airborne transmission. However, a recent study by Li et al. revealed its potential impact on short-range airborne transmission as well. Our study extends their work by developing size-dependent transmission models for both short- and long-range airborne transmission and evaluates the impact of various control strategies, including ventilation. By adopting a recently determined mode-dependent viral load, we first analyzed the role of different sizes of droplets in airborne transmission. In contrast to models with a constant viral load where large droplets contain more viruses, our findings demonstrated that droplets ranging from ∼2–4 μm are more critical for short-range airborne transmission. Meanwhile, droplets in the ∼1–2 μm range play a significant role in long-range airborne transmission. Furthermore, our study indicates that implementing a size-dependent filtration/mask strategy considerably affects the rate of change (ROC) of virus concentration in relation to both distancing and ventilation. This underscores the importance of factoring in droplet size during risk assessment. Engineering controls, like ventilation and filtration, as well as administrative controls, such as distancing and masks, have different effectiveness in reducing virus concentration. Our findings indicate that high-efficiency masks can drastically reduce virus concentrations, potentially diminishing the impacts of other strategies. Given the size-dependent efficiency of filtration, ventilation has a more important role in reducing virus concentration than filtration, especially for long-range airborne transmission. For short-range airborne transmission, maintaining distance is far more effective than ventilation, and its effectiveness is largely unaffected by ventilation. 
However, the influence of ventilation on virus concentration and its variation with distance depend mainly on the specific transmission model used. In sum, this research delineates the differential roles of droplet sizes and control strategies in both short- and long-range airborne transmission, offering valuable insights for future size-dependent airborne transmission control measures.
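The interplay of ventilation, size-dependent filtration, and masks can be illustrated with a standard well-mixed, steady-state mass balance for one droplet size bin. This is a textbook simplification, not the models developed in the study: it covers long-range transmission only, ignores deposition and inactivation, and all parameter names and values are hypothetical.

```python
def steady_state_concentration(emission, vent_flow, filter_flow,
                               filter_eff, mask_eff=0.0):
    # emission: virus copies emitted per hour in this size bin
    # vent_flow, filter_flow: air flow rates in m^3/h
    # filter_eff, mask_eff: size-dependent efficiencies in [0, 1]
    # Returns copies per m^3 in a well-mixed room at steady state.
    removal = vent_flow + filter_eff * filter_flow
    if removal <= 0:
        raise ValueError("total removal rate must be positive")
    return emission * (1.0 - mask_eff) / removal

def total_concentration(bins, vent_flow, filter_flow):
    # bins: list of (emission, filter_eff, mask_eff), one tuple per size bin.
    return sum(steady_state_concentration(e, vent_flow, filter_flow, fe, me)
               for e, fe, me in bins)
```

Because the filter term is scaled by `filter_eff`, a bin that the filter captures poorly is removed almost entirely by ventilation, which is the intuition behind weighing ventilation against filtration per droplet size.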
CT-Net: Arbitrary-Shaped Text Detection via Contour Transformer
Contour based scene text detection methods have rapidly developed recently,
but still suffer from inaccurate frontend contour initialization, multi-stage
error accumulation, or deficient local information aggregation. To tackle these
limitations, we propose a novel arbitrary-shaped scene text detection framework
named CT-Net by progressive contour regression with contour transformers.
Specifically, we first employ a contour initialization module that generates
coarse text contours without any post-processing. Then, we adopt contour
refinement modules to adaptively refine text contours in an iterative manner,
which are beneficial for context information capturing and progressive global
contour deformation. Besides, we propose an adaptive training strategy to
enable the contour transformers to learn more potential deformation paths, and
introduce a re-score mechanism that can effectively suppress false positives.
Extensive experiments are conducted on four challenging datasets, which
demonstrate the accuracy and efficiency of our CT-Net over state-of-the-art
methods. Particularly, CT-Net achieves F-measure of 86.1 at 11.2 frames per
second (FPS) and F-measure of 87.8 at 10.1 FPS for CTW1500 and Total-Text
datasets, respectively.
Comment: This paper has been accepted by IEEE Transactions on Circuits and
Systems for Video Technology.
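The F-measure quoted above is the harmonic mean of precision and recall. The sketch below shows the arithmetic from detection counts; the function name and the counts in any call are illustrative and are not taken from the paper.

```python
def f_measure(true_pos, false_pos, false_neg):
    # Precision: fraction of detected text regions that are correct.
    detected = true_pos + false_pos
    precision = true_pos / detected if detected else 0.0
    # Recall: fraction of ground-truth text regions that were detected.
    actual = true_pos + false_neg
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Benchmarks usually report the value multiplied by 100, so a return value of 0.861 corresponds to an F-measure of 86.1.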
IterativePFN: True Iterative Point Cloud Filtering
The quality of point clouds is often limited by noise introduced during their
capture process. Consequently, a fundamental 3D vision task is the removal of
noise, known as point cloud filtering or denoising. State-of-the-art learning
based methods focus on training neural networks to infer filtered displacements
and directly shift noisy points onto the underlying clean surfaces. In high
noise conditions, they iterate the filtering process. However, this iterative
filtering is only done at test time and is less effective at ensuring points
converge quickly onto the clean surfaces. We propose IterativePFN (iterative
point cloud filtering network), which consists of multiple IterationModules
that model the true iterative filtering process internally, within a single
network. We train our IterativePFN network using a novel loss function that
utilizes an adaptive ground truth target at each iteration to capture the
relationship between intermediate filtering results during training. This
ensures that the filtered results converge faster to the clean surfaces. Our
method is able to obtain better performance compared to state-of-the-art
methods. The source code can be found at:
https://github.com/ddsediri/IterativePFN.
Comment: This paper has been accepted to the IEEE/CVF CVPR Conference, 202
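The idea of an adaptive ground truth target at each iteration can be sketched on a toy problem. The code below is not the IterativePFN loss: it assumes a unit circle as the clean surface, a linear schedule for the remaining noise, and hypothetical function names, and it shows only how intermediate targets could move step by step from the noisy point to the clean one.

```python
import math

def project_to_circle(point):
    # Nearest point on the unit circle, used here as the clean surface.
    r = math.hypot(point[0], point[1])
    if r == 0:
        return (1.0, 0.0)  # arbitrary choice for the degenerate center point
    return (point[0] / r, point[1] / r)

def adaptive_target(noisy, clean, k, num_iters):
    # Target for iteration k (0-based): the clean point plus a shrinking
    # fraction of the original noise. The linear schedule is an assumption.
    frac = 1.0 - (k + 1) / num_iters
    return tuple(c + frac * (n - c) for n, c in zip(noisy, clean))

def target_schedule(noisy, num_iters=4):
    # One intermediate target per iteration; the last equals the clean point.
    clean = project_to_circle(noisy)
    return [adaptive_target(noisy, clean, k, num_iters)
            for k in range(num_iters)]
```

Supervising each iteration against its own target, rather than against the clean surface directly, is what lets a single network model the whole iterative process during training.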
Exploring core mental health symptoms among persons living with HIV: A network analysis
Context: Persons living with HIV (PLWH) commonly experience mental health symptoms. However, little is known about the core mental health symptoms and their relationships.
Objective: This study aimed to evaluate the prevalence of various mental health symptoms and to explore their relationships in symptom networks among PLWH.
Methods: From April to July 2022, we recruited 518 participants through convenience sampling in Beijing, China, for this cross-sectional study. Forty mental health symptoms, including six dimensions (somatization symptoms, negative affect, cognitive function, interpersonal communication, cognitive processes, and social adaptation), were assessed through paper-based or online questionnaires. Network analysis was performed in Python 3.6.0 to explore the core mental health symptoms and describe the relationships among symptoms and clusters.
Results: Of the 40 mental health symptoms, the most common symptoms were fatigue (71.2%), trouble remembering things (65.6%), and uncertainty about the future (64.0%). In the single symptom network, sadness was the most central symptom across the three centrality indices (rS = 0.59, rC = 0.61, rB = 0.06), followed by feeling discouraged about the future (rS = 0.51, rC = 0.57, rB = 0.04) and feelings of worthlessness (rS = 0.54, rC = 0.53, rB = 0.05). In the symptom cluster network, negative affect was the most central symptom cluster across the three centrality indices (rS = 1, rC = 1, rB = 0.43).
Conclusion: Our study provides a new perspective on the role of each mental health symptom among PLWH. To alleviate the mental health symptoms of PLWH to the greatest extent possible and comprehensively improve their mental health, we suggest that psychological professionals pay more attention to pessimistic mood and cognitive processes in PLWH. Interventions that apply positive psychology skills and cognitive behavioral therapy may be necessary components for the mental health care of PLWH.
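Strength, the first of the centrality indices reported above, can be sketched in plain Python as the sum of absolute edge weights attached to a node. The edge weights in any call are hypothetical and do not come from the study, and the abstract does not describe how its reported indices were scaled, so no scaling is applied here.

```python
def strength_centrality(edges):
    # edges: {(node_a, node_b): weight} for an undirected weighted network.
    # Strength of a node = sum of absolute weights of its incident edges.
    strength = {}
    for (a, b), weight in edges.items():
        strength[a] = strength.get(a, 0.0) + abs(weight)
        strength[b] = strength.get(b, 0.0) + abs(weight)
    return strength

def most_central(edges):
    # Node with the highest strength in the network.
    strength = strength_centrality(edges)
    return max(strength, key=strength.get)
```

Closeness and betweenness need shortest-path computations and are usually taken from a graph library rather than written by hand.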